Conversation
The field has been declared on engine.Options since the file was created (c567135, 2024-05) but was never read by Render or any helper. apply.go and template.go were assigning it from their respective --insecure flags; apply_test.go asserted the assignment. Nothing else looked at the value. The runtime guarantee — --insecure (maintenance mode) bypasses FailIfMultiNodes — is provided by WithClientMaintenance not injecting nodes into the gRPC context, not by anything inside engine.Render. The dead field added nothing on top of that. Drop the field, the assignments, and the assertion. golangci-lint is clean. Closes #123 Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
helpers.FailIfMultiNodes(ctx, name) embeds the name in its error message. The call site in engine.Render hardcoded "talm template", which was accurate when Render was only called from the template subcommand. PR #119 made apply call Render too, so users running `talm apply` with a multi-node modeline saw an error talking about `talm template` — confusing. Add Options.CommandName, default to "talm" when empty, set "talm apply" in apply's buildApplyRenderOptions and "talm template" in template's option-build. TestRenderFailIfMultiNodes_UsesCommandName covers both subcommands plus the empty-string fallback and explicitly asserts the historical "talm template" no longer leaks into the apply case. Closes #121 Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
…turns
The per-file loop in apply had two branches with asymmetric error
handling:
- template-render: `err = withApplyClient(...)` (assigning the outer
loop variable) followed by an outer `if err != nil { return err }`
- direct-patch: `if err := withApplyClient(...); err != nil { return err }`
(local scope, immediate return)
The outer check only caught errors from the template-render branch.
Any future tweak to the direct-patch branch could silently swallow an
error — exactly the shadowing class that already bit the project once
(commit b34781e).
Switch the template-render branch to the same local-scope idiom and
delete the outer check. Both branches now read identically and there
is no dead code path to misuse.
Closes #122
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
Before: when a node file's modeline contained `templates=[...]`, both
`talm apply -f node.yaml` and `talm template -f node.yaml` rendered
the listed templates and discarded the rest of the node file. Per-node
hostname, secondary interfaces with deviceSelector, VIP placement, etcd
extraArgs, certSANs and time servers all silently vanished — a
multi-NIC control plane could not bootstrap because the etcd
advertised-subnet interface was missing from the applied config.
Add engine.MergeFileAsPatch using Talos configpatcher (LoadPatches +
Apply). Apply the node file as a strategic merge patch over the
rendered template in both:
- apply.go template-rendering branch (after engine.Render, before
c.ApplyConfiguration)
- template.go templateWithFiles (after generateOutput, before
inplace-write or stdout)
Same merge step in both keeps the documented piped flow
`talm template -f X | talm apply -f -` carrying the node body through
end to end.
A modeline-only node file (no Talos config body) becomes a patch with
empty content; LoadPatches still returns a patch object, but Apply has
nothing to merge — every rendered field round-trips through the
configpatcher's loader unchanged.
TestMergeFileAsPatch covers both paths: the body-overlay case asserts
the custom hostname and the deviceSelector secondary interface land in
the merged output and the auto-generated hostname is gone; the
modeline-only case asserts every rendered field survives.
Closes #126
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
PR #119 moved talm apply's template rendering from offline to online. The new online path runs inside withApplyClient, whose wrapWithNodeContext helper batches every node from GlobalArgs.Nodes into a single gRPC context. engine.Render then calls helpers.FailIfMultiNodes which rejects multi-node contexts, so a node file with `nodes=[ip1, ip2]` (or `--nodes A,B,C` on the command line) fails before any rendering happens. The pre-#119 direct-patch flow handled multi-node fan-out at the wire level inside ApplyConfiguration, so this never surfaced. Render once per node instead. Add applyTemplatesPerNode that takes a nodes slice and injection points for the render and apply functions, then iterates: for each node it builds a single-node context with client.WithNodes, calls engine.Render (which now sees one node and satisfies the FailIfMultiNodes guard), merges the modeline'd file as a patch, and applies the result. Per-node iteration is also the correct semantic — discovery via lookup() resolves each node's own network topology rather than mashing everything together. Split withApplyClient into a public form (still wraps with the legacy multi-node context for the direct-patch branch) and withApplyClientBare which skips the wrap so the per-node loop can attach contexts itself. Three tests pin the contract: that each render and apply call sees exactly one node in its outgoing-context metadata; that engine.Render with a batched multi-node context still errors (the pre-condition the loop exists to satisfy); that an empty nodes slice errors loudly rather than silently doing nothing. Manually verified the loop assertion fails when the per-iteration context is built with all nodes instead of one. Closes #120 Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
…ure mode
Branch review caught four blockers and three majors in the original
implementation. Fixes:
- engine.MergeFileAsPatch now short-circuits modeline-only patch files
(BLOCKER 2). The Talos config-patcher misclassifies a YAML-comment-only
file as an empty JSON6902 patch, then refuses any multi-document
rendered config (the v1.12+ output format) with 'JSON6902 patches are
not supported for multi-document machine configuration'. Detect 'no
meaningful YAML content' before LoadPatches and return rendered
unchanged in that case. Fix also covers empty files and files with only
comments / document separators / whitespace.
- talm template no longer routes its output through MergeFileAsPatch
(BLOCKER 1, BLOCKER 5, MEDIUM 8). The patcher round-trips through its
config loader, normalising YAML formatting, sorting keys, and (worst
of all) stripping every YAML comment — which deletes the talm modeline
('# talm: ...') and the AUTOGENERATED warning that downstream commands
rely on. Removing the template-side merge keeps stdout and --in-place
outputs intact. The merge stays in apply because that output goes
straight to ApplyConfiguration where formatting and comments do not
matter.
- The pipe-flow comment that motivated the template-side merge is gone
(MAJOR 6). talm apply does not in fact accept stdin via '-f -'
(ExpandFilePaths only handles real paths), so the documented workflow
was a fiction.
- Insecure (maintenance) mode now opens a fresh single-endpoint client
per node (BLOCKER 4). WithClientMaintenance creates one client with
every endpoint and gRPC then round-robins ApplyConfiguration across
them, so most nodes never received the config. Narrow GlobalArgs.Nodes
to the iteration's node and restore it via defer; this requires
splitting applyTemplatesPerNode's client-acquisition into an injected
openClientFunc with two production implementations
(openClientPerNodeMaintenance, openClientPerNodeAuth).
- TestApplyTemplatesPerNode_BatchedContextIsRejected was renamed and
rewritten as TestApplyTemplatesPerNode_NeverBatchesNodes, which
asserts the loop itself never produces a multi-node ctx (MAJOR 7).
The previous test happened to call engine.Render and pass; it would
not have caught a regression in applyTemplatesPerNode.
- Two new tests cover the insecure path
(TestApplyTemplatesPerNode_MaintenanceModeOpensFreshClientPerNode and
TestOpenClientPerNodeMaintenance_NarrowsAndRestoresGlobalNodes).
TestMergeFileAsPatch grew three sub-tests covering multi-doc patching,
empty files, and comments-only files. All five new tests fail without
the corresponding fix.
- Doc comments cleaned up (MINORS 9, 10, 11): the FailIfMultiNodes
rationale lives in one place (applyTemplatesPerNode); MergeFileAsPatch
wraps the out.Bytes() error with file context; withApplyClient and
withApplyClientBare now describe what they actually do.
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
…ferences - engine.MergeFileAsPatch now reads the patch file once (was twice — one ReadFile and one LoadPatches with @-syntax) by switching from configpatcher.LoadPatches to LoadPatch on the already-read bytes. No TOCTOU window between the empty-detection check and patch loading. - New TestMergeFileAsPatch sub-test covers the realistic Talos v1.12+ apply scenario the previous test set missed: multi-document rendered config (legacy machine/cluster doc plus typed HostnameConfig and LinkConfig docs) overlaid by a non-empty strategic-merge patch in the node file. The assertion checks that the legacy machine doc absorbs the patch and the sibling typed docs survive untouched. - New TestTemplateAndApplyDiverge_NodeBodyOverlayLimitation pins the intentional divergence between talm template and talm apply for files with a non-empty body. Apply overlays the body via the config-patcher before sending; template emits the rendered bytes verbatim so the modeline and the AUTOGENERATED-warning comment survive in stdout/in- place output. A future patch that tries to make template match apply will trip this test. - openClientPerNodeAuth now uses client.WithNode (single-target metadata) instead of client.WithNodes (aggregation metadata). The per-iteration intent is one node, not a collection of one — naming matches semantics. Test helper nodesFromOutgoingCtx reads either key so the assertions stay valid. - Removed all internal-tracker references and review-round annotations from test docstrings. Tests are now self-describing: each one names the contract it pins, not the issue ticket that motivated it. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
- Sync openClientFunc docstring with the WithNode (single-target) metadata key the auth-mode opener actually uses; the old docstring still mentioned WithNodes from the previous version. - TestOpenClientPerNodeMaintenance_NarrowsAndRestoresGlobalNodes now drives the production openClientPerNodeMaintenance with an injected fake maintenanceClientFunc instead of an inline stub. The fake snapshots GlobalArgs.Nodes at the moment WithClientMaintenance reads it, proving the narrow-and-restore-via-defer logic is exercised end to end. A regression in the production function now fails this test (previously it would have continued to pass against the test-local stub). - New TestApplyTemplatesPerNode_AuthModeUsesSingleNodeMetadataKey pins the gRPC metadata key the auth-mode opener writes so a future swap back to WithNodes (which would round-trip through nodesFromOutgoingCtx unnoticed) gets caught. - README documents that node files can carry per-node patches in their body, that talm apply applies them as a strategic merge over the rendered template, and that talm template intentionally does not. Recommends apply --dry-run for previewing the exact bytes apply will send. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
CommandName is read by engine.Render to format the FailIfMultiNodes error wording. The engine-side assertion (TestRenderFailIfMultiNodes_UsesCommandName) verifies Render does the right thing with whatever value it receives, but nothing was pinning that buildApplyRenderOptions sets it to "talm apply" in the first place. A future refactor that drops the field would slip through unnoticed. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughThe changes implement per-node template rendering with merge patch semantics in Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Apply as apply.go
participant Client as Talos Client
participant Engine as engine.go
participant Node as Talos Node
User->>Apply: talm apply -f node.yaml<br/>(with templates= in modeline)
loop Per-node (applyTemplatesPerNode)
Apply->>Client: Open per-node client<br/>(maintenance or auth mode)
Apply->>Engine: Render(ctx, opts)<br/>render template
Engine-->>Apply: rendered bytes
Apply->>Engine: MergeFileAsPatch<br/>(rendered, node.yaml)
Engine->>Engine: Load patch file<br/>Apply strategic merge
Engine-->>Apply: merged bytes
Apply->>Client: ApplyConfiguration(merged)
Client->>Node: Apply merged config
Node-->>Client: OK
Client-->>Apply: Success
end
Apply-->>User: Applied to all nodes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request refactors the talm apply template-rendering path to process nodes individually, enabling per-node configuration overlays via strategic merge patches. It introduces the MergeFileAsPatch utility and adds comprehensive tests for both authenticated and maintenance modes. Review feedback suggests ensuring node resolution falls back to the Talos context when CLI arguments are missing, supporting multi-document patches in node files, improving error messaging for missing nodes, and optimizing performance by moving file I/O for patches outside the per-node loop.
| if err := withApplyClientBare(func(parentCtx context.Context, c *client.Client) error { | ||
| openClient := openClientPerNodeAuth(parentCtx, c) |
There was a problem hiding this comment.
In the authenticated template rendering path, the list of nodes is captured from GlobalArgs.Nodes before the client is created. Unlike the direct patch path (which uses wrapWithNodeContext), this path does not fall back to the nodes defined in the talosconfig context if GlobalArgs.Nodes is empty. This means talm apply will fail if nodes are not explicitly provided via CLI or modeline, even if they are defined in the current Talos context.
if err := withApplyClientBare(func(parentCtx context.Context, c *client.Client) error {
if len(nodes) == 0 {
if configContext := c.GetConfigContext(); configContext != nil {
nodes = configContext.Nodes
}
}
openClient := openClientPerNodeAuth(parentCtx, c)There was a problem hiding this comment.
Added the same fallback in 5a16ad3. Extracted resolveAuthTemplateNodes so the precedence (CLI/modeline first, talosconfig context second) is unit-testable and symmetric with what wrapWithNodeContext already did for the direct-patch branch. Insecure (maintenance) mode still uses only GlobalArgs.Nodes — the maintenance client talks to node IPs directly and carries no talosconfig context to consult.
| patch, err := configpatcher.LoadPatch(patchBytes) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("loading patch from %s: %w", patchFile, err) | ||
| } | ||
| out, err := configpatcher.Apply(configpatcher.WithBytes(rendered), []configpatcher.Patch{patch}) |
There was a problem hiding this comment.
configpatcher.LoadPatch is used to load the patch from the node file. However, LoadPatch typically only handles a single YAML document. Since Talos v1.12+ and talm templates often produce multi-document configurations, users might need to provide multi-document patches in their node files to target different parts of the configuration. Using configpatcher.LoadPatches (plural) would be more consistent with other parts of the codebase (like FullConfigProcess) and provide better support for complex configurations.
| patch, err := configpatcher.LoadPatch(patchBytes) | |
| if err != nil { | |
| return nil, fmt.Errorf("loading patch from %s: %w", patchFile, err) | |
| } | |
| out, err := configpatcher.Apply(configpatcher.WithBytes(rendered), []configpatcher.Patch{patch}) | |
| patch, err := configpatcher.LoadPatches([]string{string(patchBytes)}) | |
| if err != nil { | |
| return nil, fmt.Errorf("loading patches from %s: %w", patchFile, err) | |
| } | |
| out, err := configpatcher.Apply(configpatcher.WithBytes(rendered), patch) |
There was a problem hiding this comment.
Checked the Talos source — configpatcher.LoadPatch already routes through configloader.NewFromBytes first, which handles multi-document Talos configs as a strategic-merge patch. A multi-doc node body therefore already works today; configpatcher.LoadPatches([]string{string(patchBytes)}) with a single-element slice would resolve to the same LoadPatch call and add no capability. LoadPatches only splits its input when the caller supplies multiple strings (typically @file references), not by splitting a single string on ---. Keeping LoadPatch here.
| apply applyFunc, | ||
| ) error { | ||
| if len(nodes) == 0 { | ||
| return fmt.Errorf("no nodes specified for template-rendering apply") |
There was a problem hiding this comment.
The error message when no nodes are specified is less descriptive than the one used in the template command. Providing a consistent and helpful error message that suggests how to fix the issue (e.g., using the --nodes flag or modeline) would improve the user experience.
| return fmt.Errorf("no nodes specified for template-rendering apply") | |
| return fmt.Errorf("nodes are not set for the command: please use '--nodes' flag or configuration file to set the nodes to run the command against") |
There was a problem hiding this comment.
Reworded in 599079d to name the three concrete ways to set nodes (--nodes, modeline, talosconfig context) and strengthened TestApplyTemplatesPerNode_NoNodesIsAnError to assert the guidance stays put through future rewords. With the talosconfig-context fallback from 5a16ad3 this error now only fires when all three sources are empty.
| // ApplyConfiguration) but unsuitable for human-facing output such as | ||
| // `talm template` — which is why the template subcommand does not call | ||
| // this helper. | ||
| func MergeFileAsPatch(rendered []byte, patchFile string) ([]byte, error) { |
There was a problem hiding this comment.
The MergeFileAsPatch function reads the patchFile from disk and performs an 'effectively empty' check on every call. Since this function is called within a loop over nodes in applyTemplatesPerNode (via renderMergeAndApply), it leads to redundant file I/O and processing. For large clusters, this could become a performance bottleneck. Consider refactoring this to load the patch data once outside the per-node loop and pass the pre-loaded data or patch object to the rendering logic.
There was a problem hiding this comment.
After the multi-node + non-empty body guard in 66a9489, any non-empty patch is guaranteed to target exactly one node — so MergeFileAsPatch reads and parses the file once per apply. For a single-node call the read/parse is ≤1 ms and comfortably dominated by template render and gRPC apply; a refactor to split file I/O from the merge would pay for itself only on large node-body files applied to hundreds of nodes, which the new guard forbids. Happy to revisit if a use case emerges that the guard does not cover.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/commands/apply.go`:
- Around line 251-265: applyTemplatesPerNode currently re-uses a single
configFile body for every node and calls renderMergeAndApply for each target,
which causes node-specific overlays (hostname, addresses, VIPs) to be applied to
all nodes; update applyTemplatesPerNode to detect when the provided configFile
contains a non-empty overlay and, if len(nodes) > 1, reject the operation with a
clear error, or alternatively resolve/split the overlay per target before
entering the for loop so each openClient call uses a node-specific config;
ensure the same change/error check is applied to the other similar call site
that invokes renderMergeAndApply (the second occurrence referenced in the
comment).
In `@README.md`:
- Around line 143-158: Adjust the paragraph about "Per-node patches inside node
files" to avoid claiming legacy node-body fields (e.g., deviceSelector
interfaces, VIPs, machine.network.interfaces) survive talm apply for Talos
v1.12+ multi-doc output: either restrict the statement to legacy/single-doc
template behavior only, or explicitly add that for v1.12+ multi-doc mode users
must patch the typed resources
(LinkConfig/BondConfig/VLANConfig/Layer2VIPConfig) because the multi-doc path
intentionally ignores legacy machine.network.interfaces due to lack of a safe
1:1 translation; mention talm apply -f node.yaml and talm template -f node.yaml
where appropriate to guide readers.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d8100954-d5ef-4b44-96b9-51a4d161b29a
📒 Files selected for processing (6)
README.mdpkg/commands/apply.gopkg/commands/apply_test.gopkg/commands/template.gopkg/engine/engine.gopkg/engine/render_test.go
| func TestTemplateAndApplyDiverge_NodeBodyOverlayLimitation(t *testing.T) { | ||
| const rendered = `version: v1alpha1 | ||
| machine: | ||
| type: controlplane | ||
| network: | ||
| hostname: talos-abcde | ||
| ` | ||
| dir := t.TempDir() | ||
| nodeFile := filepath.Join(dir, "node.yaml") | ||
| const nodeBody = `# talm: nodes=["10.0.0.1"] | ||
| machine: | ||
| network: | ||
| hostname: node0 | ||
| ` | ||
| if err := os.WriteFile(nodeFile, []byte(nodeBody), 0o644); err != nil { | ||
| t.Fatalf("write node file: %v", err) | ||
| } | ||
|
|
||
| // apply path: renders, then overlays. | ||
| merged, err := engine.MergeFileAsPatch([]byte(rendered), nodeFile) | ||
| if err != nil { | ||
| t.Fatalf("MergeFileAsPatch: %v", err) | ||
| } | ||
| if !strings.Contains(string(merged), "hostname: node0") { | ||
| t.Errorf("apply path must overlay hostname: node0; got:\n%s", string(merged)) | ||
| } | ||
|
|
There was a problem hiding this comment.
🟡 The 'template path' block in TestTemplateAndApplyDiverge_NodeBodyOverlayLimitation contains two assertions that are permanently vacuous: templateOutput := []byte(rendered) simply copies the local constant, so bytes.Equal(templateOutput, []byte(rendered)) is always true (the negated check never fires) and strings.Contains(string(templateOutput), "hostname: node0") is always false because the rendered constant contains hostname: talos-abcde. The test comment claims this block will catch a future 'make template match apply' regression, but no production template code is exercised, so such a regression would pass undetected. The apply-path half of the test (calling engine.MergeFileAsPatch) is legitimate; only the template-path assertions are hollow.
Extended reasoning...
What the bug is and how it manifests
In TestTemplateAndApplyDiverge_NodeBodyOverlayLimitation (pkg/commands/apply_test.go), the 'template path' block is supposed to pin the invariant that talm template does NOT overlay the node-file body onto the rendered output. Instead, the block performs no actual test: it creates templateOutput as a direct byte-copy of the rendered constant and then asserts two things that are structurally impossible to fail.
The specific code path that triggers it
Lines 543-569 of apply_test.go (template-path block):
Assertion 1: bytes.Equal(templateOutput, []byte(rendered)) compares rendered to itself. The result is always true, so the negated guard is always false and t.Error is never reached.
Assertion 2: strings.Contains(string(templateOutput), "hostname: node0") searches for node0 inside rendered, which contains hostname: talos-abcde — never node0. This condition is always false and t.Error is never reached.
Why existing code does not prevent it
Neither the compiler nor go test has any way to detect that a predicate is structurally always-false. The test compiles and passes on every run, regardless of what generateOutput(), templateWithFiles(), or any related production code does.
What the impact would be
If a future contributor makes talm template call engine.MergeFileAsPatch (the exact regression the test comment warns about), both assertions would still pass. The first passes because templateOutput is still the copy of the constant rendered, not the output of generateOutput; the second passes because rendered still does not contain node0. The test gives false confidence in the template/apply divergence invariant.
How to fix it
The template-path block should actually call the production template code — specifically generateOutput() (or an equivalent helper) — capture its output as templateOutput, and then assert the two properties. That way a change to generateOutput that starts applying the overlay will cause hostname: talos-abcde to be replaced by hostname: node0 in the output, making the second assertion fire as intended.
Step-by-step proof
renderedis the constant string containinghostname: talos-abcde.templateOutput := []byte(rendered)creates a byte slice with exactly the same content asrendered.bytes.Equal(templateOutput, []byte(rendered))— both sides are identical byte sequences → always returnstrue.- The guard
\!bytes.Equal(...)is alwaysfalse→ the firstt.Erroris unreachable on every run. strings.Contains(string(templateOutput), "hostname: node0")searches fornode0in a string that only containstalos-abcde→ always returnsfalse.- The second
t.Erroris also unreachable on every run. - Both assertions pass permanently regardless of what the production template code does, providing only false confidence in the divergence invariant.
There was a problem hiding this comment.
Fixed in 294b4ce. You're right — templateOutput := []byte(rendered) made both assertions structurally vacuous. Deleted the vacuous block and left a comment explaining that the no-overlay invariant for talm template is structural rather than runtime (pkg/commands/template.go never calls engine.MergeFileAsPatch, and a regression that wired the merge into generateOutput would surface in the modeline round-trip tests — the assert-against-the-constant pattern cannot pin it). The apply-path half of the test keeps its meaningful MergeFileAsPatch assertion.
Address review feedback from gemini-code-assist on pkg/commands/apply.go: the authenticated template-rendering path captured nodes from GlobalArgs.Nodes and never consulted the client config context, so a user who had already run 'talosctl config node <ip>' still had to repeat --nodes on every talm apply. The direct patch branch already consults the context via wrapWithNodeContext — this aligns the template branch with the same behavior. Extract resolveAuthTemplateNodes so the precedence (CLI/modeline beats talosconfig context) is testable in isolation. Insecure mode deliberately opts out: maintenance clients connect to node IPs directly and carry no talosconfig context, so GlobalArgs.Nodes remains the only source there. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
Address review feedback from gemini-code-assist on pkg/commands/apply.go:260. The terse 'no nodes specified for template-rendering apply' did not tell the user how to fix the problem. Use the same phrasing style as the template command so the message names the concrete ways to set nodes (--nodes flag, node file modeline, talosconfig context). Strengthen TestApplyTemplatesPerNode_NoNodesIsAnError to assert the guidance survives a cosmetic reword but catches a regression that drops it. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
Address review feedback from coderabbitai on pkg/commands/apply.go: applyTemplatesPerNode renders and applies per target, but renderMergeAndApply merges the same configFile body on every iteration. A node file whose modeline targets multiple nodes with a non-empty body (pinned hostname, address, VIP, etc.) therefore stamped the same value onto every machine — the opposite of what the overlay feature is for. Reject the combination up-front in applyTemplatesPerNode: when more than one target is scheduled and engine.NodeFileHasOverlay reports the file carries a real body below its modeline, surface an error that names the file, the targets, and the remediation (split into one file per node or drop the per-node fields). Modeline-only files remain allowed — they are the common bootstrap shape where the same rendered template lands on N nodes. Expose engine.NodeFileHasOverlay so the command layer can peek at the file without reading it twice or duplicating isEffectivelyEmptyYAML. Add regression tests for both the rejected and allowed shapes, plus a standalone table test for NodeFileHasOverlay. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
Address review feedback from coderabbitai on README.md: the per-node-patches note promised that legacy node-body fields such as deviceSelector interfaces and VIPs survive talm apply, but on the Talos v1.12+ multi-doc path machine.network.interfaces fragments have no safe 1:1 mapping to LinkConfig/BondConfig/VLANConfig/ Layer2VIPConfig, so they do not translate. As written, the text misled v1.12+ users into relying on overrides that are not represented semantically the way the docs implied. Add an explicit v1.12+ caveat that tells users to patch the typed resources for per-node network settings, and call out the new one-body-one-node guard (talm apply now refuses a multi-node modeline with a non-empty body). Fields outside the network area still merge as before. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
Address review feedback on apply_test.go: the 'template path' block of TestTemplateAndApplyDiverge_NodeBodyOverlayLimitation compared the rendered constant against itself and searched for a string it could not contain, so both assertions passed regardless of what production code does — a future regression that wired engine.MergeFileAsPatch into generateOutput would still sail through. Delete the vacuous block and leave a comment explaining that the no-overlay invariant for talm template is structural (template.go never calls MergeFileAsPatch) and enforced by the modeline round-trip tests rather than a runtime assertion here. The apply-path half (MergeFileAsPatch actually merges) keeps its meaningful assertion. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
myasnikovdaniil
left a comment
There was a problem hiding this comment.
Now I have enough context. Drafting the review.
PR 128 Review Draft
Summary
This PR fixes a real, shipping regression introduced by #119: the template-rendering branch of talm apply batched all nodes into a single gRPC context, which engine.Render rejected via FailIfMultiNodes before any rendering happened. The fix iterates per node — single-node ctx in auth mode (client.WithNode) and a fresh single-endpoint client in maintenance mode (because WithClientMaintenance ignores per-call node metadata and would round-robin ApplyConfiguration across all endpoints otherwise). The companion fixes (overlay-merge of the node file body via engine.MergeFileAsPatch, error-message subcommand thread-through, error-handling symmetry, and the dead Options.Insecure removal) are tightly scoped and all sit on the same callback. Tests are comprehensive and the README clearly documents the apply/template divergence and the Talos v1.12+ caveat. I found no blockers; comments below are mostly nits.
Nit — test names
Tests like TestApplyTemplatesPerNode_LoopsOncePerNodeWithSingleNodeContext and TestOpenClientPerNodeMaintenance_NarrowsAndRestoresGlobalNodes are descriptive but very long. t.Run subtests with shorter parent names would read better in go test -v output. Style call.
Verification I performed
- Traced
client.WithNode(pkg/machinery/client/context.go:30) — confirmed it deletes thenodesmd key and setsnode, soWithClientSkipVerify's pre-injected multi-node ctx is correctly narrowed byopenClientPerNodeAuth. TheTestApplyTemplatesPerNode_AuthModeUsesSingleNodeMetadataKeytest validates this. - Traced
helpers.FailIfMultiNodes— confirmed it only inspects thenodesmd key, notnode, so single-target metadata passes the guard trivially. - Traced
configpatcher.LoadPatchandJSON6902— confirmed an effectively-empty patch is decoded as an emptyjsonpatch.Patch, andJSON6902then rejects multi-doc input withJSON6902 patches are not supported for multi-document machine configuration. The empty-content short-circuit inMergeFileAsPatchis genuinely necessary for the v1.12+ multi-doc default output. - Traced
Args.WithClientMaintenance— confirmed it readsc.Nodes(=GlobalArgs.Nodes) synchronously inside the same goroutine before callingaction, so the narrow-and-restore pattern is correct under today's Talos client semantics.
Recommendation
Comment — approve once the maintainer decides on the fail-fast-vs-multierror call (or chooses to document it and move on). Everything else is cleanup-grade.
| // applyTemplatesPerNode for why the loop is mandatory. | ||
| opts := buildApplyRenderOptions(modelineTemplates, withSecretsPath) | ||
| nodes := append([]string(nil), GlobalArgs.Nodes...) | ||
| fmt.Printf("- talm: file=%s, nodes=%s, endpoints=%s\n", configFile, GlobalArgs.Nodes, GlobalArgs.Endpoints) |
There was a problem hiding this comment.
print uses GlobalArgs.Nodes while a snapshot exists two lines up
nodes := append([]string(nil), GlobalArgs.Nodes...)
fmt.Printf("- talm: file=%s, nodes=%s, endpoints=%s\n", configFile, GlobalArgs.Nodes, GlobalArgs.Endpoints)The defensive-copy nodes is captured for stable per-iteration reuse, then the print line below it reads GlobalArgs.Nodes directly. Functionally identical here (nothing mutates the global between the snapshot and the printf), but it's inconsistent and one inadvertent reorder away from a confusing diagnostic. Suggest using nodes for the print:
nodes := append([]string(nil), GlobalArgs.Nodes...)
fmt.Printf("- talm: file=%s, nodes=%s, endpoints=%s\n", configFile, nodes, GlobalArgs.Endpoints)Separately: when the auth path falls back to talosconfig nodes (CLI nodes empty, modeline empty), this line prints nodes=[] even though the apply will use the talosconfig context's nodes. This is unchanged from pre-PR behavior, so not a regression — but worth a follow-up to print after resolveAuthTemplateNodes resolves.
There was a problem hiding this comment.
Switched the printf to use the snapshotted nodes in 6d1339b. Agreed it was one inadvertent reorder away from a confusing diagnostic.
Re: the post-resolveAuthTemplateNodes print — kept that out of scope as you suggested; the current line still shows what the user asked for (CLI/modeline) before the talosconfig fallback resolves, which is the more useful signal when the print fires before the fallback. Happy to fold a "resolved nodes" follow-up print into a separate PR if you'd rather see both.
| func openClientPerNodeMaintenance(fingerprints []string, mkClient maintenanceClientFunc) openClientFunc { | ||
| return func(node string, action func(ctx context.Context, c *client.Client) error) error { | ||
| savedNodes := append([]string(nil), GlobalArgs.Nodes...) | ||
| GlobalArgs.Nodes = []string{node} | ||
| defer func() { GlobalArgs.Nodes = savedNodes }() | ||
| return mkClient(fingerprints, action) | ||
| } | ||
| } |
There was a problem hiding this comment.
global-state mutation as the channel into WithClientMaintenance
func openClientPerNodeMaintenance(fingerprints []string, mkClient maintenanceClientFunc) openClientFunc {
return func(node string, action func(ctx context.Context, c *client.Client) error) error {
savedNodes := append([]string(nil), GlobalArgs.Nodes...)
GlobalArgs.Nodes = []string{node}
defer func() { GlobalArgs.Nodes = savedNodes }()
return mkClient(fingerprints, action)
}
}The narrow-and-restore pattern works (defer fires on panic and on mkClient returning an error), and the comment explains why. Two notes:
- This couples correctness to
WithClientMaintenancereadingGlobalArgs.Nodessynchronously insidemkClient. It does today (pkg/talosctl/.../global/client.gobuilds the client withclient.WithEndpoints(c.Nodes...)before callingaction), but a future Talos bump that defers endpoint resolution (or moves it onto a goroutine) would silently break this. Worth a TODO or, ideally, an upstreamWithClientMaintenanceFor(endpoints []string, ...)overload that takes endpoints as a parameter. TestOpenClientPerNodeMaintenance_NarrowsAndRestoresGlobalNodesonly exercises the success path. A one-liner asserting thatGlobalArgs.Nodesis also restored whenactionreturns an error would harden this:
t.Run("restores GlobalArgs.Nodes when action errors", func(t *testing.T) {
GlobalArgs.Nodes = []string{"orig"}
defer func() { GlobalArgs.Nodes = nil }()
openClient := openClientPerNodeMaintenance(nil, fakeMaintenance)
_ = openClient("10.0.0.1", func(_ context.Context, _ *client.Client) error {
return fmt.Errorf("boom")
})
if !slices.Equal(GlobalArgs.Nodes, []string{"orig"}) {
t.Errorf("GlobalArgs.Nodes not restored after error path: got %v", GlobalArgs.Nodes)
}
})Neither point blocks merge.
There was a problem hiding this comment.
Both notes addressed.
Note 1 — added a paragraph to openClientPerNodeMaintenance's package doc in 6d1339b spelling out the synchronous-read assumption: the narrowing only works because mkClient reads GlobalArgs.Nodes before action returns. A future Talos refactor that defers endpoint resolution onto a goroutine would silently break this. Took the doc-it-now path rather than chase an upstream WithClientMaintenanceFor(endpoints, ...) overload in this PR.
Note 2 — added TestOpenClientPerNodeMaintenance_RestoresGlobalNodesOnError in 6403d1e, mirroring the t.Run sketch you wrote out. Pinning the error path next to the success path means a future swap of the defer for an explicit "restore after success" block would regress audibly.
| for _, node := range nodes { | ||
| if err := openClient(node, func(ctx context.Context, c *client.Client) error { | ||
| return renderMergeAndApply(ctx, c, opts, configFile, render, apply) | ||
| }); err != nil { | ||
| return fmt.Errorf("node %s: %w", node, err) | ||
| } | ||
| } |
There was a problem hiding this comment.
fail-fast on first failing node is a behavioral change
for _, node := range nodes {
if err := openClient(node, func(ctx context.Context, c *client.Client) error {
return renderMergeAndApply(ctx, c, opts, configFile, render, apply)
}); err != nil {
return fmt.Errorf("node %s: %w", node, err)
}
}Previously, batched ApplyConfiguration via apid would attempt every node and surface aggregated per-node results through helpers.PrintApplyResults. The new loop returns on the first failure, so node C is never attempted if node B fails. For a multi-control-plane bootstrap that's probably a footgun: a transient failure on node 2 leaves nodes 3..N unconfigured and the user has to figure out which subset retried. Two reasonable options, in order of effort:
- Document this in the release notes as a deliberate trade-off (the alternative — collect-all-errors — masks the first failure under a multi-error wrapper that's harder to read in the common case).
- Continue past per-node failures, accumulate via
multierror, print per-node status as you go, and return the aggregate at the end.
Not blocking — flag for the maintainer's call.
There was a problem hiding this comment.
Going with fail-fast as the deliberate behavior, and I'll spell that out in the PR description rather than rely on the reader to infer it from the loop. The reasoning: for a multi-control-plane bootstrap, continuing past a node-2 failure to apply node-3 risks driving the cluster's quorum into a worse state than stopping at the first failure does — the operator has to retry node 2 either way, but a partial-but-uniform application is easier to reason about than a partial-and-skipping one. The aggregate-and-continue alternative is reasonable for a talm apply --best-effort-style flag, which is a clean follow-up: the loop here would gain a multierror accumulator behind a flag default-false, leaving today's fail-fast as the default. Filed mentally; will open if needed.
| for _, line := range bytes.Split(data, []byte("\n")) { | ||
| trimmed := bytes.TrimSpace(line) | ||
| if len(trimmed) == 0 { | ||
| continue | ||
| } | ||
| if trimmed[0] == '#' { | ||
| continue | ||
| } | ||
| if string(trimmed) == "---" || string(trimmed) == "..." { | ||
| continue | ||
| } | ||
| return false | ||
| } | ||
| return true | ||
| } |
There was a problem hiding this comment.
line-based isEffectivelyEmptyYAML over a YAML decoder
func isEffectivelyEmptyYAML(data []byte) bool {
for _, line := range bytes.Split(data, []byte("\n")) {
trimmed := bytes.TrimSpace(line)
if len(trimmed) == 0 {
continue
}
if trimmed[0] == '#' {
continue
}
if string(trimmed) == "---" || string(trimmed) == "..." {
continue
}
return false
}
return true
}Two correctness edge cases worth thinking about:
- Inline comments after a key. A line like
machine: {} # inlinewould be classified as non-empty (correct). But a line like# top-level\nkey: # trailing— thekey:line has a non-#first character, so it correctly flags non-empty. So this edge is fine. - Indented
---. A document separator must be at column 0 per spec.string(trimmed) == "---"strips leading whitespace before comparing, so an indented---(which would actually be a YAML scalar inside a parent mapping) would be misclassified as a separator. In practice nobody writes that, but the comment doc says "comments, document separators, and whitespace" — strictly, this is broader than that.
The simpler alternative: yaml.NewDecoder over data and call Decode until io.EOF; if every document yields a nil or empty Node, return true. That sidesteps both edges and avoids re-implementing a YAML parser. The existing implementation passes the table-driven test, so the bar to switch is low; mention this as a follow-up if you don't want to churn.
There was a problem hiding this comment.
Fixed the indented-separator edge in c5555c2 — separator comparison now uses bytes.TrimRight (trailing whitespace only) rather than bytes.TrimSpace, so an indented --- no longer matches. Added an "indented separator counts as overlay" case to TestNodeFileHasOverlay.
Stuck with the line-based scan over yaml.NewDecoder for now: the function is small, the existing table tests cover the contract, and a real YAML decoder pulls in a heavier dependency for a path that runs once per apply. The edge you flagged was the genuinely incorrect case; the rest is in the "nobody writes that" zone.
| func MergeFileAsPatch(rendered []byte, patchFile string) ([]byte, error) { | ||
| patchBytes, err := os.ReadFile(patchFile) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("reading patch %s: %w", patchFile, err) | ||
| } | ||
| if isEffectivelyEmptyYAML(patchBytes) { | ||
| return rendered, nil | ||
| } | ||
| patch, err := configpatcher.LoadPatch(patchBytes) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("loading patch from %s: %w", patchFile, err) | ||
| } | ||
| out, err := configpatcher.Apply(configpatcher.WithBytes(rendered), []configpatcher.Patch{patch}) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("applying patch from %s: %w", patchFile, err) | ||
| } | ||
| merged, err := out.Bytes() | ||
| if err != nil { | ||
| return nil, fmt.Errorf("encoding merged config from %s: %w", patchFile, err) | ||
| } | ||
| return merged, nil | ||
| } |
There was a problem hiding this comment.
MergeFileAsPatch re-reads patchFile that was already read by processModelineAndUpdateGlobals
The apply path reads each configFile at least three times during one iteration: once in processModelineAndUpdateGlobals (modeline parse), once in engine.NodeFileHasOverlay (multi-node guard), and once here in engine.MergeFileAsPatch. For a normal-sized node file it doesn't matter, but the guard's read could be folded into MergeFileAsPatch's — e.g., have MergeFileAsPatch return (merged, hasOverlay, err) and let the caller use the hasOverlay signal for the multi-node guard before calling apply. Pure cleanup; not blocking.
There was a problem hiding this comment.
Pure cleanup as you say. Declining to fold this into the PR: changing MergeFileAsPatch's signature from (merged, error) to (merged, hasOverlay, error) widens the API for every caller, and the redundant read costs ~µs on a node-file-sized input. Happy to revisit if a profile shows it pays off, but absent that, keeping the small, single-purpose function. Filed mentally as a follow-up signal rather than work.
| func fakeAuthOpenClient(parentCtx context.Context) openClientFunc { | ||
| return func(node string, action func(ctx context.Context, c *client.Client) error) error { | ||
| return action(client.WithNode(parentCtx, node), nil) | ||
| } | ||
| } |
There was a problem hiding this comment.
fakeAuthOpenClient passes a nil *client.Client
func fakeAuthOpenClient(parentCtx context.Context) openClientFunc {
return func(node string, action func(ctx context.Context, c *client.Client) error) error {
return action(client.WithNode(parentCtx, node), nil)
}
}The fake hands action a nil *client.Client. applyTemplatesPerNode doesn't dereference the client itself, but if a future change inside the loop body started calling something on c (e.g., c.GetConfigContext() for some new fallback), every fakeAuthOpenClient-driven test would panic. Consider returning a zero-value sentinel client or changing action's signature in tests to accept a non-nil interface stub. Minor.
There was a problem hiding this comment.
Documented the contract in 6403d1e rather than swap to a zero-value &client.Client{}. Reasoning: the nil here is deliberate — if a future change inside applyTemplatesPerNode starts dereferencing c, every fake-driven test panics in CI immediately. A zero-value would change "panic on call" into a different, possibly silent failure mode (the talos client struct has internal state that zero-init does not satisfy). The comment now spells this out so the design intent is discoverable.
| talm template -f nodes/node1.yaml -I | ||
| ``` | ||
|
|
||
| > **Per-node patches inside node files.** A node file can carry Talos config | ||
| > below its modeline (for example, a custom `hostname`, secondary | ||
| > interfaces with `deviceSelector`, VIP placement, or extra etcd args). | ||
| > When `talm apply -f node.yaml` runs the template-rendering branch, that | ||
| > body is applied as a strategic merge patch on top of the rendered | ||
| > template before the result is sent to the node — so per-node fields | ||
| > survive even when the template auto-generates conflicting values | ||
| > (e.g. `hostname: talos-XXXXX`). | ||
| > | ||
| > **Talos v1.12+ caveat.** The multi-document output format introduced | ||
| > in v1.12 splits network configuration into typed documents | ||
| > (`LinkConfig`, `BondConfig`, `VLANConfig`, `Layer2VIPConfig`, | ||
| > `HostnameConfig`, `ResolverConfig`). Legacy node-body fields under | ||
| > `machine.network.interfaces` have no safe 1:1 mapping to those types, | ||
| > so the multi-doc path does not translate them — if you target a | ||
| > v1.12+ Talos node, pin per-node network settings by patching the | ||
| > typed resources (e.g. a `LinkConfig` document below the modeline) | ||
| > rather than legacy `machine.network.interfaces`. Fields outside the | ||
| > network area (`machine.network.hostname` via `HostnameConfig`, | ||
| > `machine.install.disk`, extra etcd args, etc.) still merge as | ||
| > expected. | ||
| > | ||
| > **One body, one node.** A non-empty body is a per-node pin, so the | ||
| > modeline for that file must target exactly one node. `talm apply` | ||
| > refuses a multi-node modeline when the body is non-empty; modeline- | ||
| > only files (no body) are still allowed and drive the same rendered | ||
| > template on every listed target. | ||
| > | ||
| > `talm template -f node.yaml` (with or without `-I`) does **not** apply | ||
| > the same overlay: its output is the rendered template plus the modeline | ||
| > and the auto-generated warning, byte-identical to what the template | ||
| > alone would produce. Routing it through the patcher would drop every | ||
| > YAML comment (including the modeline) and re-sort keys, breaking | ||
| > downstream commands that read the file back. Use `apply --dry-run` if | ||
| > you want to preview the exact bytes that will be sent to the node. | ||
|
|
||
| ## Using talosctl commands | ||
|
|
There was a problem hiding this comment.
multi-paragraph blockquote
Document looks great; one nit: the four > **bold lead.** ... paragraphs are easy to skim on GitHub but, in a terminal, the > prefix without a blank line between paragraphs renders as one giant quote. Adding > blank lines (already present) is fine; mainly flagging that the structure is dense and could land as a nested bullet list under one > **Per-node patches inside node files.** heading.
Not actionable; just an observation.
There was a problem hiding this comment.
Noted — agreed the four bold-lead paragraphs are easier to skim on GitHub than in a terminal. Will refactor as a nested bullet list under one > **Per-node patches inside node files.** heading next time I touch this section. Leaving the current shape this PR.
The print line below the snapshot was reading GlobalArgs.Nodes directly instead of the freshly captured 'nodes' slice. Functionally identical today (nothing mutates the global between the snapshot and the printf), but inconsistent and one inadvertent reorder away from a confusing diagnostic. Switch to the snapshot. Also pin the synchronous-read assumption for openClientPerNodeMaintenance: the GlobalArgs.Nodes narrowing only works because mkClient (WithClientMaintenance in production) reads the global before the action returns. A future Talos refactor that defers endpoint resolution onto a goroutine would silently break this; the package doc now spells out the contract so the failure mode is at least discoverable. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
Add TestOpenClientPerNodeMaintenance_RestoresGlobalNodesOnError as the companion to the success-path test. Without this, a future refactor that swaps the defer-restore for a 'restore after success' block would silently regress the error path. Document fakeAuthOpenClient's nil *client.Client contract: the nil is deliberate so any future change that starts dereferencing the client surfaces as a CI panic rather than as silent test coverage of an untouched code path. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
Previously the function ran bytes.TrimSpace on each line before comparing against "---"/"...", which means an indented " ---" (a YAML scalar inside a parent mapping, not a document separator) would be treated as a separator and hide a real overlay. Compare against the line minus only trailing whitespace instead — separators must be at column 0 per the YAML spec. Comments and blank lines still use the fully trimmed form: a comment can be indented, an empty line is empty regardless of where in the file it appears. Add a regression case to TestNodeFileHasOverlay covering the indented-separator edge. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
Summary
Five issues clustered in
pkg/commands/apply.goandpkg/engine/engine.go. All were either introduced or surfaced by PR #119 (online template rendering in apply). Bundled because they share the same callback structure and overlap heavily.Per-node template rendering in
talm apply. PR fix(commands): render templates online in apply to resolve lookups #119 moved rendering inside the Talos client callback solookup()resolves real discovery data, but the callback batched all--nodes A,B,C(or modelinenodes=[A,B,C]) into a single gRPC context.engine.Renderthen hithelpers.FailIfMultiNodesand bailed out before any rendering happened. Render once per node instead, each iteration carrying a single-nodeclient.WithNodecontext so the guard passes and discovery resolves per node. In insecure (maintenance) mode each iteration opens a fresh single-endpoint maintenance client becauseWithClientMaintenancewould otherwise dial every endpoint at once and gRPC would round-robinApplyConfigurationacross them — most nodes would never receive the config.Node file is now applied as a patch over the rendered template. When a modeline contained
templates=[...],talm apply -f node.yamlrendered the listed templates and discarded the rest of the node file. Per-node hostname, secondary interfaces withdeviceSelector, VIP placement, etcd extra args — all silently lost. Newengine.MergeFileAsPatchoverlays the body using the Talos config-patcher; modeline-only and effectively-empty files short-circuit before reaching the patcher (otherwise the patcher routes empty input through JSON6902 and rejects any multi-document rendered config — the v1.12+ default output format).talm templateintentionally does not call this overlay because the patcher round-trips through its config loader, dropping the modeline and YAML comments downstream commands rely on; the README documents the divergence.Multi-node error message references the actual subcommand.
helpers.FailIfMultiNodeswas hardcoded to"talm template". After fix(commands): render templates online in apply to resolve lookups #119, users runningtalm applysaw an error talking abouttalm template. ThreadedCommandNamethroughengine.Options; defaults to"talm"when empty.Symmetric error handling in the apply per-file loop. The template-render branch assigned
err = withApplyClient(...)and relied on an outerif err != nil; the direct-patch branch already used local-scopeif err := ...; err != nil { return err }. Two idioms, one outer check that only caught one branch — the same shadowing class that already bit the project once. Both branches now use the local-scope return; the outer check is gone.Removed dead
engine.Options.Insecure. The field has been declared since the file was first added (May 2024) and was never read byRenderor any helper. The runtime guarantee —--insecure(maintenance mode) bypassesFailIfMultiNodes— comes fromWithClientMaintenancenot injecting nodes into the gRPC context, not from anything in the engine.Verification
TestRenderFailIfMultiNodes_UsesCommandName— three sub-tests asserting theCommandNamevalue surfaces in the error and the empty-string fallback.TestMergeFileAsPatch— overlay (single-doc + non-empty patch), modeline-only byte-identity, multi-doc + modeline-only (the regression vector), multi-doc + non-empty strategic-merge patch (the realistic v1.12+ scenario), empty file, comments-only file.TestApplyTemplatesPerNode_*— per-node single-node ctx; never-batches; empty-nodes errors loudly; maintenance mode opens a fresh client per iteration; auth mode writes the single-targetnodemetadata key.TestOpenClientPerNodeMaintenance_NarrowsAndRestoresGlobalNodes— drives the production opener with an injectedmaintenanceClientFuncfake that snapshotsGlobalArgs.Nodesat the momentWithClientMaintenancereads it.TestTemplateAndApplyDiverge_NodeBodyOverlayLimitation— pins the intentional template/apply divergence so a future "make template match apply" change is caught.TestBuildApplyRenderOptions— pins that the apply caller actually setsCommandName(independent of the engine-side coverage).golangci-lint run ./...is clean.Closes
Summary by CodeRabbit
Release Notes
New Features
talm apply, preserving per-node configurations even when templates generate conflicting values.talm apply --dry-runto preview exact bytes sent to nodes.Documentation
talm applyandtalm templatecommands.